EduQG: A Multi-Format Multiple-Choice Dataset for the Educational Domain

Authors

Abstract

Natural language processing technology has made significant progress in recent years, fuelled by increasingly powerful general language models. This has also inspired a sizeable body of work targeted specifically towards the educational domain, where the creation of questions (both for assessment and practice) is a laborious and expensive effort. Thus, automatic Question-Generation (QG) solutions have been proposed and studied. Yet, according to a recent survey of the educational QG community's progress, a common baseline dataset unifying multiple domains and question forms (e.g., multiple choice vs. fill-the-gap), including readily available baseline models to compare against, is largely missing. This is the gap we aim to fill with this paper. In particular, we introduce a high-quality dataset in the educational domain, containing over 3,000 entries, comprising (i) multiple-choice questions, (ii) their corresponding answers (including distractors), and (iii) associated passages from the course material used as sources for the questions. Each question is phrased in two forms, normal and cloze (i.e., fill-the-gap), and correct answers are linked to source documents with sentence-level annotations. Thus, our versatile dataset can be used for both question and distractor generation, as well as to explore new challenges such as question format conversion. Furthermore, 903 questions are accompanied by their cognitive complexity level as per Bloom's taxonomy. All questions have been generated by educational experts rather than crowd workers to ensure they are maintaining educational and learning standards. Our analysis and experiments suggest distinguishable differences between our dataset and commonly used ones for question generation for educational purposes. We believe this new dataset can serve as a valuable resource for research and evaluation in the educational domain. The dataset and baselines are made available to support further research in question generation for education (https://github.com/hadifar/question-generation).
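The abstract describes each entry as a multiple-choice question in two phrasings, a correct answer with distractors, a source passage with sentence-level links, and, for some entries, a Bloom's taxonomy level. A minimal sketch of how such an entry might be represented follows. The class name, field names, gap marker, and example content are all assumptions made for illustration; they are not the schema or data of the released dataset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MultipleChoiceEntry:
    """Hypothetical record layout; field names are illustrative only."""
    question_normal: str              # question phrased as an ordinary question
    question_cloze: str               # same question phrased as fill-the-gap
    correct_answer: str
    distractors: List[str]            # incorrect answer options
    passage_sentences: List[str]      # source passage, split into sentences
    support_sentence_ids: List[int]   # sentences that support the answer
    bloom_level: Optional[str] = None # only some entries carry this label

    def options(self) -> List[str]:
        """Return all answer options, correct answer first."""
        return [self.correct_answer] + self.distractors

    def supporting_sentences(self) -> List[str]:
        """Return the passage sentences linked to the correct answer."""
        return [self.passage_sentences[i] for i in self.support_sentence_ids]


def fill_cloze(entry: MultipleChoiceEntry, gap: str = "_____") -> str:
    """Insert the correct answer into the first gap of the cloze form."""
    return entry.question_cloze.replace(gap, entry.correct_answer, 1)


# Invented example content, not taken from the dataset.
example = MultipleChoiceEntry(
    question_normal="Which pigment absorbs light in plants?",
    question_cloze="_____ is the pigment that absorbs light in plants.",
    correct_answer="Chlorophyll",
    distractors=["Keratin", "Hemoglobin", "Melanin"],
    passage_sentences=[
        "Plants capture light energy in chloroplasts.",
        "Chlorophyll absorbs mostly red and blue light.",
    ],
    support_sentence_ids=[1],
    bloom_level="Remember",
)

print(fill_cloze(example))
```

Keeping both phrasings in one record is one way to support the format-conversion task mentioned in the abstract, since a model can be trained to map one field to the other.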

Similar resources

Effect of the Educational Pamphlet on the Quality of Multiple-Choice Questions

Background and Aim: The aim of this study was to evaluate the structural integrity of multiple-choice questions according to the Millman checklist, and to assess the distinguishing power of these questions between weak and strong students in theoretical courses of Periodontics in an academic year (2014-15) in Dental Branch of Tehran Islamic Azad University. Materials and Methods: A total of...

A Fair Power Allocation for Non-Orthogonal Multiple Access in the Power Domain

This paper presents an investigation on the performance of the Non-Orthogonal Multiple Access (NOMA) in the power domain scheme. A Power Allocation (PA) method is proposed from NOMA throughput expression analysis. This method aims to provide fair opportunities for users to improve their performance. Thus, NOMA users can achieve rates higher than, or equal to, the rates obtained with the convent...

A Linked Dataset of medical educational resources

Reusable educational resources became increasingly important for enhancing learning and teaching experiences, particularly in the medical domain where resources are particularly expensive to produce. With respect to this, research has aimed at improving interoperability across educational resources metadata repositories, which led to a fragmented landscape of competing metadata schemas, such as...

Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2023.3248790